Search results for "learner corpus"

showing 4 items of 4 documents

Comparing formulaicity of learner writing through phrase-frames: a corpus-driven study of Lithuanian and Polish EFL student writing

2018

Learner corpus research continues to provide evidence of how formulaic language is (mis)used by learners of English as a foreign language (EFL). This paper deals with less investigated multi-word units in EFL contexts, namely, phrase-frames (Fletcher 2002–2007), i.e. sets of n-grams identical except for one word (it is * to, in the * of). The study compares Lithuanian and Polish learner writing in English in terms of phrase-frames and contrasts them with native speakers. The analysis shows that certain differences between Lithuanian and Polish learners result from transfer from their native languages, yet both groups of learners share many common features. Most importantly, the phrase-frame…

060201 languages & linguistics; Learner corpus; Linguistics and Language; Lithuanian EFL learners; Phrase; English as a foreign language; 06 humanities and the arts; Lithuanian; EFL writing; phrase-frame; Polish EFL learners; Language and Linguistics; language.human_language; Linguistics; Lietuva (Lithuania); 0602 languages and literature; language; Student writing; Psychology
researchProduct
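
The abstract above defines phrase-frames as sets of n-grams identical except for one word. A minimal sketch of that idea follows; it is not the procedure used in the study. The function name `phrase_frames`, the whitespace tokenisation, and the choice to allow the variable slot in any position (including the edges) are assumptions made for illustration.

```python
from collections import defaultdict


def phrase_frames(tokens, n=4):
    """Group n-grams that differ in exactly one position into phrase-frames.

    Hypothetical helper: each frame is an n-tuple with "*" marking the
    variable slot, mapped to the set of words observed in that slot.
    """
    frames = defaultdict(set)
    for i in range(len(tokens) - n + 1):
        gram = list(tokens[i:i + n])
        for slot in range(n):
            # Replace one position with the wildcard to form a candidate frame.
            frame = tuple(gram[:slot] + ["*"] + gram[slot + 1:])
            frames[frame].add(gram[slot])
    # Keep only frames whose slot is filled by at least two different words.
    return {frame: fillers for frame, fillers in frames.items() if len(fillers) > 1}


# Toy input, tokenised on whitespace (an assumption; real corpora need more care).
sample = "in the case of error and in the light of day".split()
result = phrase_frames(sample, n=4)
```

On this toy sentence the only frame with more than one filler is `in the * of`, filled by "case" and "light". A real comparison across learner groups would also need frequency thresholds and normalisation, which this sketch leaves out.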

Analysing Lexical Density and Lexical Diversity in University Students’ Written Discourse

2015

This study analyses both lexical density and lexical diversity in the written production of two groups of first-year students at the Universitat de Valencia at the beginning and end of a one-semester teaching period. These results were compared with those obtained by a third group of students aiming at level C2. Lexical density was tested using Textalyser (http://textalyser.net) and lexical frequency was analysed using the software RANGE (Nation and Heatly, 1994). Our results prove that the students from both groups at level B1 show the same progression between writing tasks 1 and 3. Furthermore, we can claim that it is possible to obtain a reliable measure of lexical richness which is stable ac…

Learner corpus; Lexical density; Lexical functional grammar; Computer science; Anglès; Lexical diversity; General Materials Science; English Language Teaching; Linguistics; Period (music); Procedia - Social and Behavioral Sciences
researchProduct

Establishing a Standardised Procedure for Building Learner Corpora

2014

Decisions at the outset of preparing a learner corpus are of crucial importance for how the corpus can be built and how it can be analysed later on. This paper presents a generic workflow to build learner corpora while taking into account the needs of the users. The workflow results from an extensive collaboration between linguists who annotate and use the corpus and computational linguists who are responsible for providing technical support. The paper addresses the linguists’ research needs as well as the availability and usability of language technology tools necessary to meet them. We demonstrate and illustrate the relevance of the workflow using results and examples from our L1 learner cor…

corpus building workflow; L1 learner corpus; German as a first language
researchProduct

Using Automatic Morphological Tools to Process Data from a Learner Corpus of Hungarian

2014

The aim of this article is to show how automatic morphological tools originally used to analyze native speaker data can be applied to process data from a learner corpus of Hungarian. We collected written data from 35 students majoring in Hungarian studies at the University of Zagreb, Croatia. The data were analyzed by magyarlanc, a sentence splitter, morphological analyzer, POS-tagger and dependency parser, which found 667 unknown word forms. We investigated the recommendations made by the Hungarian spellchecker hunspell for these unknown words and the correct forms were manually chosen. It was found that if the first suggestion made by hunspell was automatically accepted, an accuracy score…

morphological parsing; automatic error tagging; learner corpus; natural language processing; Hungarian language
researchProduct